refactor: improve readFileTool XML output format #2340
Empty files should not have line numbers, but non-empty files with empty content at a specific line offset should.
- If content is empty, return empty string for empty files
- If content is empty but startLine > 1, return line number for empty content at that offset
This ensures that the model does not think the file contains a single empty line.
Signed-off-by: Eric Wheeler <[email protected]>
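The empty-content rule in this commit can be illustrated with a minimal sketch. This is not the project's actual addLineNumbers() from extract-text.ts; the function name addLineNumbersSketch, the `N | text` output format, and the padding behavior are assumptions made for illustration only.

```typescript
// Hypothetical sketch; NOT the real addLineNumbers() implementation.
// Assumed output format: "<lineNumber> | <text>\n" for every line.
function addLineNumbersSketch(content: string, startLine: number = 1): string {
  if (content === "") {
    // Empty file: emit nothing, so the model does not see a phantom empty line.
    // Empty slice at an offset: still report the line number at that offset.
    return startLine === 1 ? "" : `${startLine} | \n`;
  }
  const lines = content.split("\n");
  // A trailing newline leaves one empty final element; drop it so it is
  // not numbered as an extra line.
  if (lines[lines.length - 1] === "") {
    lines.pop();
  }
  // Right-align line numbers to the width of the largest one.
  const width = String(startLine + lines.length - 1).length;
  return (
    lines
      .map((line, i) => `${String(startLine + i).padStart(width, " ")} | ${line}`)
      .join("\n") + "\n"
  );
}
```

Under these assumptions, an empty file produces an empty string, while empty content read at startLine 5 produces a single numbered line.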
- Remove unnecessary XML indentation that could confuse the model
- Separate file content from notices and errors using dedicated tags
- Add line range information to content tags
- Handle empty files properly with self-closing tags
- Add comprehensive test coverage
Fixes #2278
Signed-off-by: Eric Wheeler <[email protected]>
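As a rough illustration of the points in this commit, the sketch below builds an unindented XML result with separate tags for content, notices, and errors, a line-range attribute, and a self-closing tag for empty files. The ReadResult interface, the formatFileXml function, and the exact tag and attribute names are assumptions for illustration, not the format emitted by readFileTool.ts.

```typescript
// Hypothetical shape of a read result; field and tag names are assumptions.
interface ReadResult {
  path: string;
  numberedContent: string; // already line-numbered and ending in "\n", or "" for an empty file
  startLine: number;
  endLine: number;
  notice?: string;
  error?: string;
}

// Builds the result without indentation. XML escaping is omitted for brevity.
function formatFileXml(r: ReadResult): string {
  const parts: string[] = [`<file><path>${r.path}</path>`];
  if (r.error !== undefined) {
    // Errors get their own tag instead of being mixed into the content.
    parts.push(`<error>${r.error}</error>`);
  } else if (r.numberedContent === "") {
    // Empty file: self-closing tag, no line range.
    parts.push(`<content/>`);
  } else {
    parts.push(
      `<content lines="${r.startLine}-${r.endLine}">\n${r.numberedContent}</content>`
    );
  }
  if (r.notice !== undefined) {
    parts.push(`<notice>${r.notice}</notice>`);
  }
  parts.push(`</file>`);
  return parts.join("\n");
}
```

Keeping notices and errors outside the content tag means the model can treat everything inside it as file text.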
- Always display line numbers in non-range reads
- Improve XML formatting with consistent newlines for better readability
Signed-off-by: Eric Wheeler <[email protected]>
- Update test expectations to match the new XML format with newlines
- Update tests to expect line numbers attribute in content tags
- Modify test assertions to check for the correct line range values
Signed-off-by: Eric Wheeler <[email protected]>
- Add newline to all output
- Handle trailing newlines and empty lines consistently
- Add test cases for blank lines:
  - Multiple blank lines within content
  - Multiple trailing blank lines
  - Only blank lines with offset
  - Trailing newlines
Signed-off-by: Eric Wheeler <[email protected]>
- Modified extract-text mock to preserve actual addLineNumbers implementation
- Removed mock implementation of addLineNumbers
- Updated test data to account for trailing newline
- Removed unnecessary mock verification
Signed-off-by: Eric Wheeler <[email protected]>
- Replace direct mocking of addLineNumbers with spy on actual implementation
- Add verification to ensure the real function is called when appropriate
- Add skipAddLineNumbersCheck option for cases where function should not be called
- Update test cases to use appropriate verification options
- Fix numberedFileContent to include trailing newline for consistency
Signed-off-by: Eric Wheeler <[email protected]>
This turned into a bit of a wild goose chase when I discovered that content was not always being read correctly, particularly with multiple carriage returns at the end of a file. This has been corrected as part of this pull request in 8c9caa2. I believe this is ready for review and merge.
one more test to fix ...
- Direct data processing provides more accurate results by preserving exact content with carriage returns
- Improved performance through minimal buffering and efficient string operations
- Use string indexes to find newlines while maintaining their original format
- Handle all edge cases correctly with preserved line endings
- Add tests for various edge cases including empty files, single lines, and different line endings
Signed-off-by: Eric Wheeler <[email protected]>
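The index-based approach described in this commit can be sketched on an in-memory string. This is a simplified analogue under stated assumptions: the PR's readLines() in read-lines.ts processes streamed data chunks, whereas the hypothetical sliceLines below takes the whole text and 0-based inclusive line indexes. It locates each newline with indexOf rather than splitting, so the original line endings (including "\r\n") are returned untouched.

```typescript
// Hypothetical in-memory analogue; NOT the real readLines().
// startLine and endLine are 0-based and inclusive (an assumption).
function sliceLines(text: string, startLine: number, endLine: number): string {
  if (startLine < 0 || endLine < startLine) {
    return "";
  }
  let line = 0; // index of the line that begins at `pos`
  let pos = 0; // current scan offset into `text`
  let from = -1; // start offset of the requested range, -1 until reached
  while (pos < text.length) {
    if (line === startLine) {
      from = pos;
    }
    // Find the end of the current line without splitting or normalizing.
    const nl = text.indexOf("\n", pos);
    const next = nl === -1 ? text.length : nl + 1;
    if (line === endLine) {
      return text.slice(from, next);
    }
    pos = next;
    line++;
  }
  // The range ran past the end of the text: return whatever was reached.
  return from === -1 ? "" : text.slice(from);
}
```

Because the result is a direct slice of the input, trailing blank lines and mixed line endings survive exactly as they appear in the file. Lone "\r" line endings are not treated as line breaks in this sketch.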
Great! Will look soon |
Remove unused variable declaration to appease ellipsis-dev linter requirements. Signed-off-by: Eric Wheeler <[email protected]>
all tests are passing, this is ready for review |
Context
This PR improves the XML output format in readFileTool.ts to make it more structured and less prone to misinterpretation by the model.
Implementation
readLines()
Test Examples
Example 1: Definitions Only (maxReadFileLine=0)
Example 2: Content with Line Range
Example 3: Empty File
Example 4: Error Handling
How to Test
Get in Touch
Discord: KJ7LNW
Fixes #2278
cc: @hannesrudolph
Important
Refactor readFileTool.ts to improve XML output structure, add error handling, and enhance test coverage.
- readFileTool.ts to improve structure and clarity.
- read-file-maxReadFileLine.test.ts, read-file-xml.test.ts, and extract-text.test.ts.
- maxReadFileLine values, binary files, and range parameters.
- addLineNumbers() in extract-text.ts to handle trailing newlines and empty content correctly.
- readLines() in read-lines.ts to process data chunks directly and handle edge cases.
This description was generated for commit 1c7d1b24f260be734f7e4c9c227346375348d8f1. It will automatically update as commits are pushed.